OBSERVATIONS ON TWO LATIN VOCABULARY TESTS



ELSIE GARLAND HOBSON 
Phebe Anna Thorne Model School, Bryn Mawr College



The School Review for December, 1919, listed four sets of
standard vocabulary tests, prepared respectively by V. A. C. Henmon, of the
University of Wisconsin; H. A. Brown, of the Oshkosh State Normal
School; Dr. Paul Hanus, of Harvard; and Daniel Starch, of the
University of Wisconsin, and J. M. Watters. Partly for the sake
of seeing what light these tests would throw on the pupils' knowledge 
of Latin vocabulary, partly to get some idea, if possible, of the 
validity of the different tests, I arranged with the co-operation of 
the teacher of Latin to give these four sets to four classes in the 
week of February 3-10, 1920. A combination of circumstances 
prevented our giving the Hanus tests, and I have not as yet been 
able to get any standard scores or other information about the 
Brown test. Therefore this discussion is limited to the Henmon 
and the Starch-Watters tests. It may be said that the results of
the Brown test indicate that it is more akin to those issued by 
Mr. Henmon than to that of Mr. Starch. 

The Henmon and the Starch-Watters tests differ radically in
composition. The Starch-Watters test comprises one hundred
words selected, so the test states, "by choosing every twentieth
word from Lodge's Vocabulary of High School Latin." 1 It is difficult
to understand just how the words were arranged when this
count was made. The list given does not include "every twentieth
word" if one goes straight through the two thousand words of the
vocabulary in alphabetical order, nor yet if one takes separately the
vocabularies of the Caesar, Cicero, and Vergil years.

1 Gonzalez Lodge, Vocabulary of High School Latin, "Columbia University
Teachers College Contributions to Education," 1915.

However, this is perhaps an unimportant question to raise,
since undoubtedly the words do represent a random selection. The
groups designated by Lodge as Caesar, Cicero, and Vergil words
are proportionately represented; that is, there are fifty-one from
the first group which contains one thousand words; twenty-four 
from the second, which contains five hundred words; and twenty- 
five from the third, which also contains five hundred words. Of 
the latter group only three occur in either Caesar or Cicero. This 
gives an opportunity to differentiate between the pupil who has 
studied Vergil and the one who has not. Since 60 per cent of the 
words in the first six books of Vergil do not occur in those portions 
of Caesar or Cicero covered by the Lodge list it is evident that a 
list of words common to all three authors is scarcely adequate to 
test the knowledge of fourth-year pupils. The one hundred words 
of this test occur in high-school Latin from four to five hundred and 
forty-five times each, the median number being fifteen times. 1 
No scale values are assigned to the words. The test is scored only 
by the number of correct meanings given. The standard June 
scores are: first year, thirty-five; second year, fifty; third year, 
sixty-five; fourth year, eighty. There is nothing on the test sheet 
to indicate how the standard scores were arrived at. 
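
The statement that the three groups are proportionately represented can be checked by simple arithmetic. The following is only a sketch in Python, using the group sizes and word counts quoted above; the variable and function names are illustrative and form no part of either test.

```python
# Sizes of the three groups in Lodge's list and the number of test words
# drawn from each, as quoted in the text.
lodge_group_sizes = {"Caesar": 1000, "Cicero": 500, "Vergil": 500}
test_word_counts = {"Caesar": 51, "Cicero": 24, "Vergil": 25}

list_total = sum(lodge_group_sizes.values())  # two thousand words
test_total = sum(test_word_counts.values())   # one hundred words


def expected_count(group):
    """Number of words a strictly proportional sample would take from a group."""
    return test_total * lodge_group_sizes[group] / list_total


# Difference between the actual and the strictly proportional count.
deviations = {g: test_word_counts[g] - expected_count(g) for g in lodge_group_sizes}

for group, deviation in deviations.items():
    print(f"{group}: {test_word_counts[group]} words in test, "
          f"{expected_count(group):.0f} expected, deviation {deviation:+.0f}")
```

On these figures no group departs from strict proportion by more than one word.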

There are five Henmon tests, A, B, C, D, and X. They are 
based on two hundred and thirty-nine words which are common to 
thirteen beginners' books and occur in all three of the writers 
ordinarily read in secondary schools, Caesar, Cicero, and Vergil. 
It is therefore conceivable, though unfortunately not probable, that 
if the tests are given in June, and the first-year pupils have used one 
of these thirteen books, every pupil might make a perfect score, 
a contingency which would be impossible with the Starch-Watters 
test. On the basis of results in nineteen schools (eight hundred and 
forty-seven pupils) each word is assigned a scale value for each 
year and also a general scale value for all years. 2 Thirty-nine 
words are discarded, including some difficult to score and others so 
easy that they were not missed at all by third- and fourth-year 
pupils. The remaining two hundred words are divided into four 
groups, making Tests A, B, C, and D. The sums of the scale values
of the four groups are approximately the same and the tests are
presumably of equal difficulty. Test X consists of twenty-five
words chosen from the foregoing tests. These words are all of so
nearly equal scale value that the difference is negligible and hence
may be disregarded.

1 This count is taken from Lodge's Vocabulary of High School Latin already
referred to, as are all other similar statements in this article. This vocabulary covers
Caesar B.G. i-v; Cicero, In Cat. i-iv, Pro Arch., De Man. Leg.; Verg. Aen. i-vi.

2 The Journal of Educational Psychology, VIII, 9, gives a complete account of the
tests including the method of finding these scale values.

Mr. Henmon, in the article in which he describes the tests, 
says of these twenty-five words, "They are not too difficult for 
first-year pupils nor too easy for fourth-year pupils." Noting the 
frequency with which the words occur one can hardly refrain from 
thinking that if the latter part of this statement is true, it indicates 
an unfortunate situation in the Latin classes. Reference to Lodge's 
Vocabulary of High School Latin shows that the two hundred words 
occur from six to one thousand eight hundred and fifty times each 
in those portions of Caesar, Cicero, and Vergil which this vocabu- 
lary covers, with seventy-eight as the median number of occur- 
rences; and that the twenty-five words of Test X occur from twelve 
to one thousand and ninety-six times with a median of sixty-three. 
Evidently mere frequency of occurrence counts for little if there is 
no definite effort to memorize. It is an interesting commentary 
on this fact as well as on the ineffectiveness of our teaching of Latin 
that the determinative pronoun is, which every pupil is supposed 
to learn in his first year and which he sees more than one thousand 
times in the next three years, should have the same scale value 
for fourth-year pupils as cur which occurs twelve times in the three 
upper years, pax which occurs thirty-six times, and accipio which 
occurs eighty-three times. If any words could be eliminated 
"because no third- or fourth-year pupils missed them," one might 
reasonably expect is to be one of that number. It is true that 
pronouns are notoriously difficult but surely not impossible. It 
would seem that the seven hundred and twenty-five occurrences 
in Caesar alone ought to be enough to put this into the "too easy" 
class. One wonders how even the most obtuse pupil manages to 
miss it. The standard June scores in the A, B, C, D tests, in
percentage correct, are first year, 66; second year, 78; third year,
88; fourth year, 90. As one would expect from the fact that these
words occur in each of the four years, these standard scores are
considerably higher than those of the Starch-Watters test.



512 THE SCHOOL REVIEW [September 

The tests were given on successive days to four classes in the 
following order: February 2, Henmon's A and B; February 3, 
C and D; February 4, X and the Brown test; February 5, Starch-Watters.
The papers were taken up as soon as they were completed,
no comment was made on them, and the pupils had no
reason to suppose that the same words would recur in other tests. 
In fact there were few repetitions except that, as mentioned above, 
Test X is made up of words selected from A, B, C, and D. An 
examination of the papers showed that generally the pupils who 
mistranslated a word at all missed it both times. The classes 
were small, ranging from four to eleven pupils. The results, accord- 
ingly, are not conclusive in themselves. Of the sixty-four classes 
which Mr. Henmon tested, thirty-nine had ten or fewer pupils, 
and are fairly comparable from the standpoint of numbers with the 
classes in this investigation. 

Before giving the results of the tests it may be well to state that 
the organization of the Latin course in the school where these tests 
were made differs radically from that of the average high school. 
The course in Latin covers six years. The usual work of the first 
year is spread over three years in the following way: 

First year     2 half-hours per week, 30 weeks
Second year    3 half-hours per week, 30 weeks
Third year     9 half-hours per week, 30 weeks

making a total of two hundred and ten hours. This includes
absolutely all the time spent on Latin, both study and recitation.
Nearly all the work is done under supervision, only two half-hours 
per week in the third year being allowed for home study. This is 
slightly less time than is given to first-year Latin in a school year 
of thirty-five weeks with a forty-minute daily recitation period and 
a study period of the same duration, an average time allowance. 
The other three years follow the usual course except that the amount 
of home study is always small. 
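
The comparison of time allowances may be verified as follows. This is a rough sketch in Python; the figure of five school days per week for the conventional course is an assumption, since the text says only that the recitation is daily.

```python
# Extended course: half-hours per week in each of the three years,
# thirty weeks per year, as given in the schedule above.
half_hours_per_week = [2, 3, 9]
weeks_per_year = 30
extended_hours = sum(half_hours_per_week) * 0.5 * weeks_per_year

# Conventional first year: thirty-five weeks, a forty-minute recitation
# and a forty-minute study period each day.
days_per_week = 5  # assumption: the text does not state the number of days
minutes_per_day = 40 + 40
conventional_hours = 35 * days_per_week * minutes_per_day / 60

print(f"Extended course (three years): {extended_hours:.0f} hours")
print(f"Conventional first year:       {conventional_hours:.1f} hours")
```

Under that assumption the conventional allowance comes to about two hundred and thirty-three hours, against two hundred and ten for the extended course.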

The results of the tests are given in the following tables. Table
I gives the median score for each class for Tests A, B, C, and D,
together. It will be remembered that these tests are of practically
equal difficulty, the total scale value of each being 107-108.8.
Table II gives the scores for Test X. The average score is given for
the Vergil and Cicero classes because the cases are so few, four in
each, and the scores so bunched that the term median ceases to
have any significance. The median is used for the other two classes
where there are ten or more cases. In Table II are included results
from the same test secured in June, 1918, in another school, designated
"School B." It will be easily seen that they are quite in
accord with results obtained this year. Table III gives the median
scores on the Starch-Watters test for February and also for June
when this test was given again. Because of the character of the
scores obtained from the Henmon tests in February it seemed
hardly worth while to repeat them. It should be noticed that the
scores on the Henmon tests, except those for School B, are all
February scores and that they are compared with the Standard
June score.

TABLE I
Henmon Tests A, B, C, D, 50 Words Each

Class                  Vergil   Cicero   Caesar   Beginners' Latin
February, 1920         (values not preserved)
Standard June score    90       88       78       66

TABLE II
Henmon Test X, 25 Words
(Scale values are disregarded because approximately equal)
(No. = number of words correct; % = percentage correct)

                             Vergil           Cicero           Caesar        Beginners' Latin
Class                      No.     %        No.     %        No.     %        No.     %
School A, February, 1920   24.5*   98*      23.5*   94*      22.5    90       14.5    58
School B, June, 1918       24.2*   96.8*    24.4*   97.6*    23.4*   93.6*    21.5    86
Standard June score        22.6    90.4     22      88       18.6    74.3     13.7    54.8

* Scores thus marked are averages, the others medians. The actual scores in percentage made
by the Vergil classes were: School A, 100, 100, 96, 96; School B, 100, 100, 100, 96, 92. Cicero class:
School A, 100, 96, 96, 84; School B, 100, 100, 100, 96, 92. Caesar class: School B, 100, 100, 96, 92
(6 pupils), 88. In most cases the average and median were both computed and were rarely found to differ
more than two points.
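
The starred averages of Table II may be recomputed from the individual scores given in its footnote. The sketch below, in Python, does this for three of the classes with the standard library's statistics module; the dictionary keys and field names are illustrative only.

```python
from statistics import mean, median

# Individual Test X scores in percentage correct, copied from the
# footnote to Table II.
scores = {
    "Vergil, School A": [100, 100, 96, 96],
    "Cicero, School A": [100, 96, 96, 84],
    "Cicero, School B": [100, 100, 100, 96, 92],
}

WORDS_IN_TEST_X = 25

summary = {}
for label, values in scores.items():
    average = mean(values)
    summary[label] = {
        "average_pct": average,
        "median_pct": median(values),
        # Convert the percentage back into a number of words correct.
        "average_words": average * WORDS_IN_TEST_X / 100,
    }

for label, row in summary.items():
    print(f"{label}: average {row['average_pct']:.1f} per cent "
          f"({row['average_words']:.1f} words), median {row['median_pct']:.1f} per cent")
```

For the Cicero class of School A the average is 94 and the median 96, a difference of two points, which agrees with the remark in the footnote.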

From an inspection of Tables I and II it seems probable that the
Henmon tests are too easy to furnish a satisfactory scale. They
provide means of discriminating neither between the better pupils
of each class, nor yet between the better pupils of the different years,
and moreover the standards for the several years appear to be too
low. This is particularly true of Test X. On this test the median
of the Caesar class of School A is almost up to the standard of the
Vergil year. In School B, the average score of the Caesar class
is above the fourth-year standard, and the median of the beginning
class is above that of the Caesar year. The comparatively low
score made by the beginning class in School A does not seem to
argue well for the more extended course. Another factor, however,
should be considered, namely, that the class has not been using
one of the "thirteen recent or most widely used beginners' books"
on which the test vocabulary is based. In fact their work has been
done for the most part without a book, and 25 per cent of the two
hundred words were entirely unknown to them. On the basis of
words which they were supposed to know the class made a median
score of 70 per cent on Tests A, B, C, D, and 80 per cent on Test X.
These scores are well above the standard. The question may still
be raised whether the class has as large and well-selected vocabulary
as it should. This will be considered again in connection with the
results of the Starch-Watters test.

TABLE III
Starch-Watters Test, 100 Words

Class                           Vergil   Cicero   Caesar   Beginners' Latin
Median score, February, 1920    83       67       37.5     25.5
Median score, June, 1920        90       77       58.5     31
Standard June score             80       65       50       35

The criticism already made as to the lack of range of the Henmon
tests is confirmed by the consideration of individual scores. In
School A the Vergil class, out of a possible twenty scores on five
tests, made nine perfect scores; the Cicero class, three perfect scores
out of twenty; the Caesar class, one perfect score out of fifty. It
is evident that the second-year pupil who makes 100 on this test
cannot have the same command of Latin vocabulary that the
fourth-year pupil does who makes the same score. A comparison
of the scores of four pupils who made 100 on Test X in February
with their respective scores on the Starch-Watters test at the same
time shows the advantage of the latter in obtaining a comparative
rating.

Pupil No.           Henmon X    Starch-Watters
1 (Fourth year)     100         91
2 (Fourth year)     100         85
3 (Third year)      100         72
4 (Second year)     100         47

No one made a perfect score on the Starch-Watters test either in
February or June. This test evidently gives some opportunity 
for the unusual pupil. 
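
The difference in range between the two tests, for the four pupils just listed, can be put in figures. The following is a minimal sketch in Python; the measure used, the difference between the highest and the lowest score, is chosen here merely for illustration.

```python
# (year, Henmon Test X score, Starch-Watters score) for the four pupils
# who made 100 on Test X in February.
pupils = [
    ("Fourth year", 100, 91),
    ("Fourth year", 100, 85),
    ("Third year", 100, 72),
    ("Second year", 100, 47),
]

henmon_scores = [henmon for _, henmon, _ in pupils]
starch_watters_scores = [sw for _, _, sw in pupils]

# Spread between the highest and the lowest score on each test.
henmon_spread = max(henmon_scores) - min(henmon_scores)
starch_watters_spread = max(starch_watters_scores) - min(starch_watters_scores)

print("Henmon X spread:", henmon_spread)
print("Starch-Watters spread:", starch_watters_spread)
```

The four pupils are indistinguishable on Test X but are separated by forty-four points on the Starch-Watters test.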

An inspection of Table III shows the median of the three upper 
classes well above the standard in June but the beginning class 
slightly below. This takes us back to the question already raised, 
whether this class has an adequate vocabulary. Results from the 
other classes which have come up under a similar plan of work 
indicate that there is at least no cause for alarm about their future. 
An examination of their papers reveals very few mistranslations. 
Their knowledge is accurate and dependable as far as it goes. It 
may not be amiss to ask whether it is legitimate to expect a score 
of thirty-five from first-year pupils. On the basis of the two 
thousand words in the Lodge list from which the words of the test 
are selected, this means a knowledge of seven hundred words. 
My own experience and observation in schools where there is a 
definite plan for teaching vocabulary are that it is very difficult for a 
beginning class to learn more than five hundred words and master 
the requisite forms and principles of syntax. On the other hand, 
the third- and fourth-year classes acquire vocabulary very rapidly 
when once they are thoroughly conversant with a rather limited list 
of simple words and with the principles of word building. For this 
reason the Starch-Watters scores for third and fourth year seem
low, and the June scores made by School A more nearly what ought
to be required.
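
The reckoning by which a score of thirty-five is taken to mean seven hundred words may be extended to the other years. This sketch in Python assumes, as the argument above does, that the hundred test words are a fair sample of the two thousand in the Lodge list, so that a score may be read as a percentage of the whole list; the names are illustrative.

```python
LODGE_LIST_SIZE = 2000  # words in the Lodge list
TEST_SIZE = 100         # words in the Starch-Watters test

# Standard June scores for the four years, as stated on the test.
standard_june_scores = {"first": 35, "second": 50, "third": 65, "fourth": 80}

# Vocabulary implied by each standard score, assuming the test words
# are a representative sample of the whole list.
implied_vocabulary = {
    year: score * LODGE_LIST_SIZE // TEST_SIZE
    for year, score in standard_june_scores.items()
}

for year, words in implied_vocabulary.items():
    print(f"{year} year: score {standard_june_scores[year]} implies about {words} words")
```

On this reckoning the standards imply seven hundred, one thousand, thirteen hundred, and sixteen hundred words for the four years respectively.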

A comparison of scores on the Starch-Watters test with school
rating shows that the results of this test indicate fairly well the 
relative rank of pupils. The exceptions fall into two classes. 
The first learn isolated facts easily but do not follow out a logical 
process of thought. In their own favorite phrase, they "know the 
words but cannot put them together." The other group is made 
up of pupils whose school rating reflects rather an attitude of 
persistent endeavor than actual attainment. 

The improvement in the scores from February to June shown in 
Table III is undoubtedly due to the fact that it is the practice of the 
school to devote five minutes a day at certain periods of the year 
to vocabulary drill, either taking up words that have been missed
in prose and reading or else reviewing the vocabulary of the 
year as a whole. This practice has been found to yield valu- 
able results in improvement in sight reading as well as in formal 
vocabulary tests. In the period between February and June the 
Vergil class reviewed Lodge's Vergil list, the Cicero class reviewed 
the Cicero list and about two hundred of the less common words 
from the Caesar list, and the Caesar class reviewed between 
seven hundred and eight hundred of the more common words of 
the Caesar list. 

Quite aside from the matter of comparative scores a vocabulary
test throws considerable light on the type of mistakes that must be
guarded against. Such words as otium and odium, concilium and
consilium, tum and dum, mos and mors, ibi and ubi are perpetual
pitfalls for the careless and the unwary. Too much care cannot
be taken to make such words permanent when they are first learned,
for once a sense of confusion is established, the case is practically
hopeless. In this same class belong the pronouns quis, aliquis,
quisquam, quisque, and also hic, ille, and is. The former are
confused because of the common element quis, the latter because
of their similarity of meaning. Still more deplorable are the mistakes
which show that the pupil has acquired some vague ideas
but no exact knowledge. To translate difficilis as "difficulty," celer as
"swiftly," or relinquo as "remainder" indicates either hopeless carelessness
or a lack of intelligence which should excuse the pupil from further
work in this field.

The following conclusions may be drawn from the results 
obtained. The Starch-Watters test is a more satisfactory measure 
than Henmon's tests for a comparison both of classes and of 
individuals in a class. Most of the pupils tested were up to the 
standard score; many were above it. The lowest scores were made 
by the pupils who have just completed the first-year work. The 
fact that this work is done very informally and does not follow the 
usual methods is perhaps a sufficient explanation of this situation. 
It may be that the standard set is too high. At all events the 
record of the three upper classes indicates that the class is not 
likely to be handicapped. 

The acquisition of an adequate vocabulary evidently cannot be 
left to chance nor can we depend on mere frequency of occurrence 
to fix correct meanings in the pupil's mind. Short periods of regular 
and definite drill with perception cards or printed lists show most 
satisfactory results. 



